Automating Coreference: The Role of Annotated Training Data
We report here on a study of interannotator agreement in the coreference task
as defined by the Message Understanding Conference (MUC-6 and MUC-7). Based on
feedback from annotators, we clarified and simplified the annotation
specification. We then performed an analysis of disagreement among several
annotators, concluding that only 16% of the disagreements represented genuine
disagreement about coreference; the remainder of the cases were mostly
typographical errors or omissions, easily reconciled. Initially, we measured
interannotator agreement in the low 80s for precision and recall. To try to
improve upon this, we ran several experiments. In our final experiment, we
separated the tagging of candidate noun phrases from the linking of actual
coreferring expressions. This method shows promise - interannotator agreement
climbed to the low 90s - but it needs more extensive validation. These results
position the research community to broaden the coreference task to multiple
languages, and possibly to different kinds of coreference.
Comment: 4 pages, 5 figures. To appear in the AAAI Spring Symposium on
Applying Machine Learning to Discourse Processing. The Alembic Workbench
annotation tool described in this paper is available at
http://www.mitre.org/resources/centers/advanced_info/g04h/workbench.htm
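The interannotator agreement described above is measured as precision and recall of one annotator's coreference links against another's. A minimal sketch of that set-overlap scoring, assuming link sets are represented as pairs of mention ids (the helper name and example data are illustrative, not from the study):

```python
# Illustrative sketch: agreement between two annotators' coreference
# links, scored as precision/recall over the sets of marked links.

def link_agreement(annotator_a, annotator_b):
    """Treat annotator_b's links as the key; score annotator_a against it."""
    a, b = set(annotator_a), set(annotator_b)
    matched = a & b
    precision = len(matched) / len(a) if a else 0.0
    recall = len(matched) / len(b) if b else 0.0
    return precision, recall

# Each annotator marks links between mention ids in the same document.
ann1 = {("m1", "m2"), ("m2", "m3"), ("m4", "m5")}
ann2 = {("m1", "m2"), ("m2", "m3"), ("m6", "m7")}
p, r = link_agreement(ann1, ann2)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.67 recall=0.67
```

Under this view, the paper's finding that most disagreements were typos or omissions corresponds to links missing from one set rather than genuinely conflicting links.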
BioCreAtIvE Task 1A: gene mention finding evaluation
Background: The biological research literature is a major repository of knowledge. As the amount of literature increases, it will get harder to find the information of interest on a particular topic. There has been an increasing amount of work on text mining this literature, but comparing this work is hard because of a lack of standards for making comparisons. To address this, we worked with colleagues at the Protein Design Group, CNB-CSIC, Madrid, to develop BioCreAtIvE (Critical Assessment of Information Extraction in Biology), an open common evaluation of systems on a number of biological text mining tasks. We report here on task 1A, which deals with finding mentions of genes and related entities in text. "Finding mentions" is a basic task, which can be used as a building block for other text mining tasks. The task makes use of data and evaluation software provided by the (US) National Center for Biotechnology Information (NCBI).

Results: 15 teams took part in task 1A. A number of teams achieved scores over 80% F-measure (balanced precision and recall). The teams that tried to use their task 1A systems to help on other BioCreAtIvE tasks reported mixed results.

Conclusion: The 80%-plus F-measure results are good, but still somewhat lag the best scores achieved in some other domains, such as newswire, due in part to the complexity and length of gene names compared to person or organization names in newswire.
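The balanced F-measure used to score task 1A is the harmonic mean of precision and recall over gene mentions. A small sketch of the computation (the counts below are invented for illustration, not task 1A results):

```python
# F-measure (harmonic mean of precision and recall), as used to score
# gene mention finding: tp/fp/fn are counts of true positive, false
# positive, and false negative mentions.

def f_measure(tp, fp, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical system: finds 850 of 1000 gold mentions, with 180 spurious ones.
print(round(f_measure(tp=850, fp=180, fn=150), 3))  # 0.837
```

Because the measure is balanced, a system cannot reach the 80% range by trading precision for recall or vice versa; both must be high.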
Critical Assessment of Information Extraction Systems in Biology
An increasing number of groups are now working in the area of text mining, focusing on a wide range of problems and applying both statistical and linguistic approaches.
However, it is not possible to compare the different approaches, because there
are no common standards or evaluation criteria; in addition, the various groups
are addressing different problems, often using private datasets. As a result, it is
impossible to determine how well the existing systems perform, and particularly what
performance level can be expected in real applications. This is similar to the situation
in text processing in the late 1980s, prior to the Message Understanding Conferences
(MUCs). With the introduction of a common evaluation and standardized evaluation
metrics as part of these conferences, it became possible to compare approaches, to
identify those techniques that did or did not work and to make progress. This progress
has resulted in a common pipeline of processes and a set of shared tools available to
the general research community. The field of biology is ripe for a similar experiment.
Inspired by this example, the BioLINK group (Biological Literature, Information
and Knowledge [1]) is organizing a CASP-like evaluation for the text data-mining
community applied to biology. The two main tasks specifically address two major
bottlenecks for text mining in biology: (1) the correct detection of gene and protein
names in text; and (2) the extraction of functional information related to proteins
based on the GO classification system. For further information and participation
details, see http://www.pdg.cnb.uam.es/BioLink/BioCreative.eval.htm
How to Evaluate your Question Answering System Every Day and Still Get Real Work Done
In this paper, we report on Qaviar, an experimental automated evaluation
system for question answering applications. The goal of our research was to
find an automatically calculated measure that correlates well with human
judges' assessment of answer correctness in the context of question answering
tasks. Qaviar judges the response by computing recall against the stemmed
content words in the human-generated answer key. It counts the answer correct
if it exceeds a given recall threshold. We determined that the answer
correctness predicted by Qaviar agreed with the human 93% to 95% of the time.
41 question-answering systems were ranked by both Qaviar and human assessors,
and these rankings correlated with a Kendall's Tau measure of 0.920, compared
to a correlation of 0.956 between human assessors on the same data.
Comment: 6 pages, 3 figures, to appear in Proceedings of the Second
International Conference on Language Resources and Evaluation (LREC 2000).
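The two mechanisms in the Qaviar abstract — judging a response by recall against the stemmed content words of the answer key, and comparing system rankings with Kendall's tau — can both be sketched compactly. Everything below is illustrative: the stemmer, stopword list, and threshold are placeholders, not Qaviar's actual components.

```python
# Sketch of a Qaviar-style automated judgement. The stemmer, stopword
# list, and 0.5 threshold here are placeholder assumptions.

STOPWORDS = {"the", "a", "an", "of", "in", "on", "was", "is", "to"}

def crude_stem(word):
    # Placeholder stemmer: strip one common suffix.
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def content_stems(text):
    words = [w.strip(".,").lower() for w in text.split()]
    return {crude_stem(w) for w in words if w and w not in STOPWORDS}

def judge(response, answer_key, threshold=0.5):
    # Correct iff recall of the key's stemmed content words exceeds the threshold.
    key = content_stems(answer_key)
    matched = content_stems(response) & key
    recall = len(matched) / len(key) if key else 0.0
    return recall >= threshold

def kendall_tau(rank_a, rank_b):
    # Kendall's tau-a over two paired rankings (no tie correction).
    n = len(rank_a)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

print(judge("Armstrong walked on the moon in 1969",
            "Neil Armstrong, 1969"))  # True
```

A tau of 0.920 between Qaviar's ranking of the 41 systems and the human ranking, against 0.956 between the human assessors themselves, is the sense in which the automated judge is "nearly as consistent as a second human".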
Overview of BioCreAtIvE: critical assessment of information extraction for biology
Background: The goal of the first BioCreAtIvE challenge (Critical Assessment of Information Extraction in Biology) was to provide a set of common evaluation tasks to assess the state of the art for text mining applied to biological problems. The results were presented in a workshop held in Granada, Spain, March 28–31, 2004. The articles collected in this BMC Bioinformatics supplement, entitled "A critical assessment of text mining methods in molecular biology", describe the BioCreAtIvE tasks, systems, results and their independent evaluation.

Results: BioCreAtIvE focused on two tasks. The first dealt with extraction of gene or protein names from text, and their mapping into standardized gene identifiers for three model organism databases (fly, mouse, yeast). The second task addressed issues of functional annotation, requiring systems to identify specific text passages that supported Gene Ontology annotations for specific proteins, given full-text articles.

Conclusion: The first BioCreAtIvE assessment achieved a high level of international participation (27 groups from 10 countries). The assessment provided state-of-the-art performance results for a basic task (gene name finding and normalization), where the best systems achieved a balanced 80% precision/recall or better, which potentially makes them suitable for real applications in biology. The results for the advanced task (functional annotation from free text) were significantly lower, demonstrating the current limitations of text-mining approaches where knowledge extrapolation and interpretation are required. In addition, an important contribution of BioCreAtIvE has been the creation and release of training and test data sets for both tasks. There are 22 articles in this special issue, including six that provide analyses of results or data quality for the data sets, among them a novel inter-annotator consistency assessment for the test set used in task 2.
Biocuration Workflow Catalogue
As the first phase of a knowledge engineering study of biocuration workflows, we performed a preliminary task-modeling exercise on seven separate bioinformatics systems. This involved constructing UML activity diagrams from detailed interviews with curators in order to understand the organization of the process the biocurators used to populate their system. The objective of this work was to identify common patterns within the workflows where we might apply text mining methods to accelerate curation. We compiled a number of workflows in a common format but were largely unable to consolidate these structures into a formal structure that facilitated comparison across workflows. We presented this work as a slideshow and publish this account of the catalogue as supplementary information.